Optimal SNR Model Selection in Multiple-Model Based Speech Recognition System

Author

  • Yongjoo Chung
Abstract

In a multiple-model based speech recognition system, multiple HMMs corresponding to different noise types and SNR values are trained, and the model closest to the input speech is selected for recognition. Previous research on multiple-model based speech recognition has assumed that the best performance is obtained by selecting the HMM whose SNR is most similar to that of the input speech. Our experimental results, however, show that better performance can be obtained when there is some mismatch between the SNR of the input speech and that of the selected HMM. In this paper, we experimentally determine the optimal HMM for each SNR value of the input speech in the multiple-model based speech recognizer. In recognition experiments on the Aurora 2 database, using the experimentally determined optimal HMMs gave far better recognition results than the conventional multiple-model based speech recognizer.
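The selection idea described in the abstract can be illustrated with a minimal sketch. It assumes a held-out accuracy table indexed by input SNR and model SNR is already available, and that the input SNR is estimated elsewhere. The function names, the nearest-bucket lookup, and all numbers below are hypothetical placeholders for illustration, not the paper's procedure or results, and noise-type selection is left out entirely.

```python
from typing import Dict

def build_optimal_map(accuracy: Dict[int, Dict[int, float]]) -> Dict[int, int]:
    """For each input SNR (dB), pick the model SNR (dB) with the highest held-out accuracy.

    accuracy[input_snr][model_snr] is a recognition accuracy measured on held-out data.
    """
    return {
        input_snr: max(per_model, key=per_model.get)
        for input_snr, per_model in accuracy.items()
    }

def select_model_snr(estimated_snr_db: float, optimal_map: Dict[int, int]) -> int:
    """Snap the estimated input SNR to the nearest tabulated SNR, then look up its model SNR."""
    nearest_input = min(optimal_map, key=lambda s: abs(s - estimated_snr_db))
    return optimal_map[nearest_input]

# Invented placeholder accuracies (%), NOT taken from the paper or from Aurora 2.
toy_accuracy = {
    0: {0: 60.0, 5: 62.0, 10: 55.0},
    5: {0: 70.0, 5: 78.0, 10: 80.0},
    10: {0: 75.0, 5: 88.0, 10: 90.0},
}

optimal_map = build_optimal_map(toy_accuracy)
chosen = select_model_snr(4.2, optimal_map)
```

With this toy table, an input estimated at 4.2 dB is snapped to the 5 dB row and mapped to the 10 dB model, so the selected model deliberately differs in SNR from the input. A conventional selector would instead return the matched 5 dB model.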

Similar resources

Compensation of SNR and noise type mismatch using an environmental sniffing based speech recognition solution

Multiple-model based speech recognition (MMSR) has been shown to be quite successful in noisy speech recognition. Since it employs multiple hidden Markov model (HMM) sets that correspond to various noise types and signal-to-noise ratio (SNR) values, the selected acoustic model can be closely matched with the test noisy speech, which leads to improved performance when compared with other state-o...

Analysis and Optimization of Telephone Speech Command Recognition System Performance in Noisy Environment

This paper deals with the analysis and optimization of a speech command recognition system (SCRS) trained on Czech telephone database Speechdat(E) for use in a selected noisy environment. The SCRS is based on hidden Markov models of context dependent phones (triphones) and mel-frequency cepstral coefficients analysis of speech (MFCC). The main aim is to analyze and to search for the optimal set...

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature-space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...

Lip-reading from parametric lip contours for audio-visual speech recognition

This paper describes the incorporation of a visual lip tracking and lip-reading algorithm that utilizes the affine-invariant Fourier descriptors from parametric lip contours to improve the audio-visual speech recognition systems. The audio-visual speech recognition system presented here uses parallel hidden Markov models (HMMs), where a joint decision, using an optimal decision rule, is made af...

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Publication date: 2012